**** Online appendix to the paper "A tutorial on the use of differences-in-differences in management, finance and accounting research" published by RAC (2021)
**** Authors: (1) Schiozer, R.F (rafael.schiozer@fgv.br); (2) Mourad, F.A. (fmourad@gmail.com) & (3) Martins, T. C. (theocm@gmail.com)
**** Affiliations: (1)Fundacao Getulio Vargas/EAESP; (2) Banco Central do Brasil; (3) Fundacao Getulio Vargas/EAESP and Banco Central do Brasil


***********************************************************************************************************
*******************  INSTRUCTIONS *************************************************************************
***********************************************************************************************************

****     1. Copy this do-file and database file into the same directory (folder)
****     2. Change the working directory to the folder above (see line 9)


cd ""  // SHOULD CHANGE TO YOUR WORKING DIRECTORY HERE (e.g. "C:\data")


/*

** Code Summary

This code (sequentially) provides all descriptive statistics, the main text regressions, as 
well as unreported results.

This code was tested in Stata 14.2, but it should work on newer versions as well. It 
uses the following Stata default commands: "local", "tabstat", "ranksum", "ttest", "keep",
"preserve", "restore", "collapse", "replace", "twoway", "reg", "areg", "egen", "graph export". 
If you have questions about these commands, please consult Stata User’s guide
by calling "help 'command'" in the command window (e.g. "help preserve").


Notes to regression models:

    You can prefix a variable with i. to specify indicators for each level (category) of the variable.
    You can put a # between two variables to create an interaction–indicator for each combination of the categories of the variables.
    You can put ## instead to specify a full factorial of the variables—main effects for each variable and an interaction.
    If you want to interact a continuous variable with a factor variable, just prefix the continuous variable with c..

	For more information on the notation above, please consult Stata User's guide by calling "help fvvarlist" in the command window.



This code also uses the following non-default packages:

	"estout": tools for making regression tables in a more friendly way (i.e. paper style). 
	For more information, please refer to http://repec.sowi.unibe.ch/stata/estout

	"coefplot": plots the coefficients estimated from a regression model along with its 
	confidence intervals. For more information, please refer to 
	http://repec.sowi.unibe.ch/stata/coefplot 
	
*/


** Installing non-default packages
** ssc install estout
** ssc install coefplot


** Stata parameters definitions
cls 				// Cleaning Results window
clear all 			// Cleaning Memory
set matsize 800 	// Sets the maximum number of variables that can be included
set more off		// Prevent Stata from asking "see more" on Results window


** Loading the database (file should be in the current working directory)
use "RAC.dta", clear	// open raw database



***********************************************************************************************************
*******************  MAIN TEXT ANALYSIS *******************************************************************
***********************************************************************************************************


******
******   Bank Descriptive Statistics - Table 1
******

**** Statistics by period (pre-crisis / crisis) and group (protected / unprotected)
local vars w_ln_zScore liquidAssetsToStFunding assets // "vars" is a group of variables that will be used
estpost tabstat `vars' if year >= 2005 & year <= 2007, statistics(mean median sd n) by(treated) nototal columns(statistics) // descriptive statistics for the pre-crisis period (i.e. 2005-2007)
esttab . using "Table1.csv", replace scsv cells("mean p50 sd count") noobs unstack // export table to file

estpost tabstat `vars' if year >= 2008 & year <= 2010, statistics(mean median sd n) by(treated) nototal columns(statistics) // descriptive statistics for the post-crisis period (i.e. 2008-2010)
esttab . using "Table1.csv", append scsv cells("mean p50 sd count") noobs unstack // export table to file


**** Median comparison tests between groups (Wilcoxon rank-sum tests) for each variable
**** Note: refer to Appendix for (unreported) standard t-tests

** ln(Z_Score)
ranksum w_ln_zScore if year>=2005 & year<=2007, by (treated) 
ranksum w_ln_zScore if year>=2008 & year<=2010, by (treated) 
** Liquidity
ranksum liquidAssetsToStFunding if year>=2005 & year<=2007, by (treated) 
ranksum liquidAssetsToStFunding if year>=2008 & year<=2010, by (treated) 
** Total Assets
ranksum assets if year>=2005 & year<=2007, by (treated) 
ranksum assets if year>=2008 & year<=2010, by (treated) 



******
******   Parallel Trends Checking - Figure 2
******

** Preserve original dataset
preserve

** Calculate mean of ln(Z_Score) by year and group (protected / unprotected)
collapse (mean) w_ln_zScore if year <= 2010, by(year treated)

** Auxiliary variable to highlight the pre-crisis and the crisis periods
egen max = max(w_ln_zScore)
replace max = ceil(max*10)/10

** Draw graph
twoway area max year if year <= 2007, lc(bg) fc(gscale1) fintensity(inten30) || ///
       area max year if inrange(year, 2008, 2010), lc(bg) fc(gscale1) fintensity(inten10) || ///
       (scatter w_ln_zScore year if treated == 0,msymbol (circle) mcolor(red) msize(medium)) || ///
       (scatter w_ln_zScore year if treated == 1, msymbol(square) mcolor(blue)msize(medium)), ///
       legend(order(4 3 2 1) label(4 "Protected") label(3 "Unprotected") label(2 "Crisis") label(1 "Pre-Crisis")) ///
       xlabel(2005(1)2010) xtitle("Year") ytitle("Ln(Z_Score)") 

** Graph Export
graph export "Figure_2.png", as(png) replace
	   
** Restore original dataset
restore


******
******   DID Estimator (2005 - 2010) - Table 2
******
** Define list of control variables
local bankCov_pre2007 ln_assets_pre2007 liquidAssetsToStFunding_pre2007 // Bank Control Variables (as of 2007)
local countryCov_pre2007 HHI_pre2007 gdpPerCapita_pre2007 LgdpGrowth_pre2007 L2gdpGrowth_pre2007 creditToPrivate_pre2007 // Country Control Variables (as of 2007)

** Column 1: 	No Fixed Effects or Controls
eststo: reg w_ln_zScore c.crisis##c.treated if year <=2010, cluster(indexNumber)
estadd local bank_fe "no"
estadd local country_fe "no"
estadd local year_fe "no"

** Column 2: 	No Fixed Effect, with Bank Covariates
eststo: reg w_ln_zScore c.crisis##c.treated `bankCov_pre2007' if year <=2010, cluster(indexNumber)
estadd local bank_fe "no"
estadd local country_fe "no"
estadd local year_fe "no"

** Column 3: 	No Fixed Effect, with Bank and Country Covariates
eststo: reg w_ln_zScore c.crisis##c.treated `bankCov_pre2007' `countryCov_pre2007' if year <=2010, cluster(indexNumber)
estadd local bank_fe "no"
estadd local country_fe "no"
estadd local year_fe "no"

** Column 4: 	With Country FE and Bank Covariates
eststo: areg w_ln_zScore c.crisis##c.treated `bankCov_pre2007' if year <=2010, absorb(countryIndex) cluster(indexNumber)
estadd local bank_fe "no"
estadd local country_fe "yes"
estadd local year_fe "no"

** Column 5: 	With Bank FE (exclude banks with exactly one observation, i.e. "N_indexNumber" > 1 )
egen N_indexNumber = count(indexNumber) if year<=2010, by (indexNumber)
eststo: areg w_ln_zScore crisis c.crisis#c.treated if year <=2010 & N_indexNumber > 1, absorb(indexNumber) cluster(indexNumber)
estadd local bank_fe "yes"
estadd local country_fe "---"
estadd local year_fe "no"

** Column 6: 	With Bank FE and yearly dummies
eststo: areg w_ln_zScore c.crisis#c.treated i.year if year <=2010 & N_indexNumber > 1, absorb(indexNumber) cluster(indexNumber)
estadd local bank_fe "yes"
estadd local country_fe "---"
estadd local year_fe "yes"

** Export table to file
esttab using "Table2.csv", replace scsv scalars(bank_fe country_fe year_fe N_clust) se r2 drop(*year _cons)

** Clear models results
eststo clear


******
******   DID Estimator (reversal: 2008 - 2013) - Table 3
******
** Define list of control variables
local bankCov_pre2010 ln_assets_pre2010 liquidAssetsToStFunding_pre2010 // Bank Control Variables (as of 2010)
local countryCov_pre2010 HHI_pre2010 gdpPerCapita_pre2010 LgdpGrowth_pre2010 L2gdpGrowth_pre2010 creditToPrivate_pre2010 // Country Control Variables (as of 2010)

** Column 1: 	No Fixed Effects or Controls
eststo: reg w_ln_zScore c.crisis_reversal##c.treated if year >= 2008, cluster(indexNumber)
estadd local bank_fe "no"
estadd local country_fe "no"
estadd local year_fe "no"

** Column 2: 	No Fixed Effect, with Bank Covariates
eststo: reg w_ln_zScore c.crisis_reversal##c.treated `bankCov_pre2010' if year >= 2008, cluster(indexNumber)
estadd local bank_fe "no"
estadd local country_fe "no"
estadd local year_fe "no"

** Column 3: 	No Fixed Effect, with Bank and Country Covariates
eststo: reg w_ln_zScore c.crisis_reversal##c.treated `bankCov_pre2010' `countryCov_pre2010' if year >= 2008, cluster(indexNumber)
estadd local bank_fe "no"
estadd local country_fe "no"
estadd local year_fe "no"

** Column 4: 	With Country FE and Bank Covariates
eststo: areg w_ln_zScore c.crisis_reversal##c.treated `bankCov_pre2010' if year >= 2008, absorb(countryIndex) cluster(indexNumber)
estadd local bank_fe "no"
estadd local country_fe "yes"
estadd local year_fe "no"

** Column 5: 	With Bank FE (exclude banks with exactly one observation, i.e. "N_indexNumber" > 1
egen N_indexNumber_reversal = count(indexNumber) if year>=2008, by (indexNumber)
eststo: areg w_ln_zScore crisis_reversal c.crisis_reversal#c.treated if year >= 2008 & N_indexNumber_reversal > 1, absorb(indexNumber) cluster(indexNumber)
estadd local bank_fe "yes"
estadd local country_fe "---"
estadd local year_fe "no"

** Column 6: 	With Bank FE and yearly dummies (exclude banks with exactly one observation)
eststo: areg w_ln_zScore c.crisis_reversal#c.treated i.year if year >= 2008 & N_indexNumber_reversal > 1, absorb(indexNumber) cluster(indexNumber)
estadd local bank_fe "yes"
estadd local country_fe "---"
estadd local year_fe "yes"

** Export table to file
esttab using "Table3.csv", replace scsv scalars(bank_fe country_fe year_fe N_clust) se r2 drop(*year _cons)

** Clear models results
eststo clear


******
******   Treatment effect over time - Figure 3
******
** Estimate regression and store coefficients
eststo did_2005: reg w_ln_zScore crisis treated i.year#c.treated if year<=2010, cluster(indexNumber)

** Draw graph
coefplot did_2005, base omitted yline(0) vertical keep(*.year#c.treated) ///
		mlabel format(%9.2g) mlabposition(3) ///
		xlabel(1 "2005" 2 "2006" 3 "2007" 4 "2008" 5 "2009" 6 "2010") ///
		ytitle("Ln (Z_Score)") ///
		xtitle("Year")

** Graph Export
graph export "Figure_3.png", as(png) replace
			
** Clear models results
eststo clear

			
***********************************************************************************************************
************************************* APPENDIX: UNREPORTED RESULTS ****************************************
***********************************************************************************************************

******
****** 	Mean comparison tests between periods (t-test) for each variable
******

** ln(Z_Score)
ttest w_ln_zScore if year>=2005 & year<=2007, by (treated) 
ttest w_ln_zScore if year>=2008 & year<=2010, by (treated) 
** Liquidity
ttest liquidAssetsToStFunding if year>=2005 & year<=2007, by (treated) 
ttest liquidAssetsToStFunding if year>=2008 & year<=2010, by (treated) 
** Total Assets
ttest assets if year>=2005 & year<=2007, by (treated) 
ttest assets if year>=2008 & year<=2010, by (treated) 


******
****** 	Equation 20 estimatation (i.e. DiD expressed in time-differences)
******

** Preserve original dataset
preserve


** Excludes years 2011 onwards and keeps only banks that appear both before and after the shock
keep if year<=2010
gen bank_both=Banks_2005_2007*Banks_2008_2010
keep if bank_both==1

** Calculate mean of ln(Z_Score) by pre/post crisis and group (protected / unprotected)
collapse (mean) w_ln_zScore treated, by(crisis indexNumber)

** Calculate the difference in Z-Scores by bank (post-shock and pre-shock averages)
gen aux1=0
replace aux1=w_ln_zScore if crisis==0
gen aux2=0
replace aux2=w_ln_zScore if crisis==1
sort indexNumber
by indexNumber: egen ln_zScore_pre = total(aux1)  
by indexNumber: egen ln_zScore_post = total(aux2)
gen delta_zScore= ln_zScore_post-ln_zScore_pre  
quietly by indexNumber:  gen dup = cond(_N==1,0,_n)
drop if dup==2
drop aux1 aux2

** Equation 20 model
eststo: reg delta_zScore treated, vce(robust)


** Export table to file
esttab using "Table_Equation20_Unreported.csv", replace scsv scalars(N_clust) se r2 drop(_cons)

** Clear models results
eststo clear

** Restore original dataset
restore


******
******   DID Estimator (2005 - 2010) with only banks that appear both before and after the crisis
******

** Define list of control variables
local bankCov_pre2007 ln_assets_pre2007 liquidAssetsToStFunding_pre2007 // Bank Control Variables (as of 2007)
local countryCov_pre2007 HHI_pre2007 gdpPerCapita_pre2007 LgdpGrowth_pre2007 L2gdpGrowth_pre2007 creditToPrivate_pre2007 // Country Control Variables (as of 2007)

** Column 1: 	No Fixed Effects or Controls
eststo: reg w_ln_zScore c.crisis##c.treated if year <=2010 & Banks_2005_2007==1 & Banks_2008_2010==1, cluster(indexNumber)
estadd local bank_fe "no"
estadd local country_fe "no"
estadd local year_fe "no"

** Column 2: 	No Fixed Effect, with Bank Covariates
eststo: reg w_ln_zScore c.crisis##c.treated `bankCov_pre2007' if year <=2010 & Banks_2005_2007==1 & Banks_2008_2010==1, cluster(indexNumber)
estadd local bank_fe "no"
estadd local country_fe "no"
estadd local year_fe "no"

** Column 3: 	No Fixed Effect, with Bank and Country Covariates
eststo: reg w_ln_zScore c.crisis##c.treated `bankCov_pre2007' `countryCov_pre2007' if year <=2010 & Banks_2005_2007==1 & Banks_2008_2010==1, cluster(indexNumber)
estadd local bank_fe "no"
estadd local country_fe "no"
estadd local year_fe "no"

** Column 4: 	With Country FE and Bank Covariates
eststo: areg w_ln_zScore c.crisis##c.treated `bankCov_pre2007' if year <=2010 & Banks_2005_2007==1 & Banks_2008_2010==1, absorb(countryIndex) cluster(indexNumber)
estadd local bank_fe "no"
estadd local country_fe "yes"
estadd local year_fe "no"

** Column 5: 	With Bank FE (exclude banks with exactly one observation, i.e. "N_indexNumber" > 1 )
eststo: areg w_ln_zScore crisis c.crisis#c.treated if year <=2010 & N_indexNumber > 1 & Banks_2005_2007==1 & Banks_2008_2010==1, absorb(indexNumber) cluster(indexNumber)
estadd local bank_fe "yes"
estadd local country_fe "---"
estadd local year_fe "no"

** Column 6: 	With Bank FE and yearly dummies
eststo: areg w_ln_zScore c.crisis#c.treated i.year if year <=2010 & N_indexNumber > 1 & Banks_2005_2007==1 & Banks_2008_2010==1, absorb(indexNumber) cluster(indexNumber)
estadd local bank_fe "yes"
estadd local country_fe "---"
estadd local year_fe "yes"

** Export table to file
esttab using "Table_2_BanksBeforeAndAfter_Unreported.csv", replace scsv scalars(bank_fe country_fe year_fe N_clust) se r2 drop(*year _cons)

** Clear models results
eststo clear



******
******  	Treatment effect over time - 2007 as the reference year
******


** Estimate regression and store coefficients
eststo did_2007: reg w_ln_zScore crisis treated b2007.year#c.treated if year<=2010, cluster(indexNumber)

** Draw graph
coefplot did_2007, base omitted yline(0) vertical keep(*.year#c.treated) ///
		mlabel format(%9.2g) mlabposition(3) ///
		xlabel(1 "2005" 2 "2006" 3 "2007" 4 "2008" 5 "2009" 6 "2010") ///
		ytitle("Ln (Z_Score)") ///
		xtitle("Year")

** Graph Export		
graph export "Figure_3_2007_as_reference_Unreported.png", as(png) replace
			
** Clear models results
eststo clear

** END
